Analysis of Unannotated Equine Transcripts Identified by mRNA Sequencing

Authors

  • Stephen J. Coleman
  • Zheng Zeng
  • Matthew S. Hestand
  • Jinze Liu
  • James N. Macleod
Abstract

Sequencing of equine mRNA (RNA-seq) identified 428 putative transcripts that do not map to any previously annotated or predicted horse genes. Most of these encode the equine homologs of known protein-coding genes described in other species, yet the potential exists to identify novel and perhaps equine-specific gene structures. A set of 36 transcripts was prioritized for further study by filtering for level of expression (depth of RNA-seq read coverage), distance from annotated features in the equine genome, number of putative exons, and patterns of gene expression between tissues. From these, four were selected for further investigation based on predicted open reading frames of at least 50 amino acids and a lack of detectable homology to known genes across species. Sanger sequencing of RT-PCR amplicons from additional equine samples confirmed the expression and structural annotation of each transcript. Functional predictions were made by conserved domain searches. A single transcript, expressed in the cerebellum, contains a putative Krüppel-associated box (KRAB) domain, suggesting a potential function associated with zinc finger proteins and transcriptional regulation. Overall levels of conserved synteny and sequence conservation across a 1 Mb region surrounding each transcript were approximately 73% relative to the human, canine, and bovine genomes; however, the four loci display some areas of low conservation and sequence inversion in the regions that immediately flank these previously unannotated equine transcripts. Taken together, the evidence suggests that these four transcripts are likely to be equine-specific.
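The open reading frame criterion mentioned in the abstract (a predicted ORF of at least 50 amino acids) can be illustrated with a short sketch. The abstract does not say how ORFs were predicted, so everything here is an assumption made for illustration: the function names `longest_orf_aa` and `passes_orf_filter` are hypothetical, an ORF is taken to run from an ATG to the first in-frame stop codon, and both strands are scanned.

```python
# Illustrative sketch only: the ORF definition (ATG to first in-frame stop,
# both strands) is an assumption, not the authors' documented method.

STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_ORF_AA = 50  # threshold stated in the abstract


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf_aa(seq):
    """Length in amino acids (stop codon excluded) of the longest
    ATG-to-stop ORF found in any of the six reading frames."""
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None  # index of the current in-frame ATG, if any
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, (i - start) // 3)
                    start = None
    return best


def passes_orf_filter(seq, min_aa=MIN_ORF_AA):
    """True if the sequence contains an ORF of at least min_aa residues."""
    return longest_orf_aa(seq) >= min_aa


# Synthetic example: ATG followed by 49 alanine codons and a stop = 50 residues.
example = "ATG" + "GCT" * 49 + "TAA"
print(passes_orf_filter(example))
```

A filter of this kind only checks coding potential; the homology, expression, and synteny criteria described in the abstract would need separate tools and are not modeled here.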


Similar resources

Transcriptomic landscape of breast cancers through mRNA sequencing

Breast cancer is a heterogeneous disease with a poorly defined genetic landscape, which poses a major challenge in diagnosis and treatment. By massively parallel mRNA sequencing, we obtained 1.2 billion reads from 17 individual human tissues belonging to TNBC, Non-TNBC, and HER2-positive breast cancers and defined their comprehensive digital transcriptome for the first time. Surprisingly, we id...


ENCODE Tiling Array Analysis Identifies Differentially Expressed Annotated and Novel 5′ Capped RNAs in Hepatitis C Infected Liver

Microarray studies of chronic hepatitis C infection have provided valuable information regarding the host response to viral infection. However, recent studies of the human transcriptome indicate pervasive transcription in previously unannotated regions of the genome and that many RNA transcripts have short or lack 3' poly(A) ends. We hypothesized that using ENCODE tiling arrays (1% of the genom...


Identification and Classification of New Transcripts in Dorper and Small-Tailed Han Sheep Skeletal Muscle Transcriptomes

High-throughput mRNA sequencing enables the discovery of new transcripts and additional parts of incompletely annotated transcripts. Compared with the human and cow genomes, the reference annotation level of the sheep genome is still low. An investigation of new transcripts in sheep skeletal muscle will improve our understanding of muscle development. Therefore, applying high-throughput sequenc...


Large-scale identification of novel transcripts in the human genome.

Although the sequencing of the human genome has been completed, the number and identity of genes contained within it remains to be fully determined. We used LongSAGE to analyze 660,357 human transcripts from human brain mRNA and identified expression of 17,409 known genes and >15,000 different transcripts that were not annotated in genome databases. Analysis of a subset of these unannotated tra...


Super-resolution ribosome profiling reveals unannotated translation events in Arabidopsis.

Deep sequencing of ribosome footprints (ribosome profiling) maps and quantifies mRNA translation. Because ribosomes decode mRNA every 3 nt, the periodic property of ribosome footprints could be used to identify novel translated ORFs. However, due to the limited resolution of existing methods, the 3-nt periodicity is observed mostly in a global analysis, but not in individual transcripts. Here, ...



Journal title:

Volume 8, Issue

Pages -

Publication date: 2013